SIGMA-GEN: Structure and Identity Guided Multi-subject Assembly for Image Generation

1 University of Massachusetts Amherst
2 Adobe Research

ICLR 2026

SIGMA-GEN enhances controllability of text-to-image workflows by allowing users to prescribe both structure and subject identity.

Abstract

We present SIGMA-GEN, a unified framework for multi-identity-preserving image generation. SIGMA-GEN is the first approach to enable single-pass, multi-subject, identity-preserved generation guided by both structural and spatial constraints. A key strength of our method is its ability to support user guidance at various levels of precision, from coarse 2D or 3D boxes to pixel-level segmentations and depth, with a single model. To enable this, we introduce SIGMA-SET27K, a novel synthetic dataset that provides identity, structure, and spatial information for over 100k unique subjects across 27k images. Through extensive evaluation, we demonstrate that SIGMA-GEN achieves state-of-the-art performance in identity preservation, image generation quality, and speed.

Unified control modality
SIGMA-GEN enables unified control over image generation at varying levels of granularity, including 2D boxes, 3D boxes, and 3D objects, with a single model.
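
To make the idea of mixed-granularity guidance concrete, here is a minimal sketch of one way coarse and fine spatial inputs could share a single per-subject representation. It is an illustration under stated assumptions, not the SIGMA-GEN implementation: the `SubjectControl` container, the `box_to_mask`, `mask_area`, and `overlap` helpers, and the choice of a binary mask as the common format are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Mask = List[List[int]]  # H x W grid of 0/1 values


@dataclass
class SubjectControl:
    """Hypothetical container: a subject label plus its spatial mask."""
    subject_id: str
    mask: Mask


def box_to_mask(box: Tuple[int, int, int, int], height: int, width: int) -> Mask:
    """Rasterize a coarse 2D box (x0, y0, x1, y1), end-exclusive, into a binary mask."""
    x0, y0, x1, y1 = box
    # Clamp to the canvas so out-of-bounds boxes do not raise.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [
        [1 if (y0 <= y < y1 and x0 <= x < x1) else 0 for x in range(width)]
        for y in range(height)
    ]


def mask_area(mask: Mask) -> int:
    """Count the pixels assigned to a subject."""
    return sum(sum(row) for row in mask)


def overlap(a: Mask, b: Mask) -> int:
    """Count the pixels claimed by both masks (assumes equal shapes)."""
    return sum(pa & pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


# Assumed example: one subject given as a coarse box, another as a pixel-level mask.
H, W = 4, 6
coarse = SubjectControl("dog", box_to_mask((0, 0, 3, 2), H, W))
fine = SubjectControl(
    "cat",
    [
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 1, 1],
        [0, 0, 0, 0, 1, 0],
    ],
)
```

In this sketch both subjects end up as masks of the same shape, so any downstream consumer would not need to know whether the user supplied a box or a segmentation. How SIGMA-GEN itself encodes 3D boxes, depth, and identity is not specified on this page and is not modeled here.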

BibTeX

@article{saha2025sigma,
  title={SIGMA-GEN: Structure and Identity Guided Multi-subject Assembly for Image Generation},
  author={Saha, Oindrila and Krs, Vojtech and Mech, Radomir and Maji, Subhransu and Blackburn-Matzen, Kevin and Gadelha, Matheus},
  journal={arXiv preprint arXiv:2510.06469},
  year={2025}
}